(57) ABSTRACT

A user is enabled to navigate through an electronic data base
in a personalized manner. A context is created based on a
profile of the user, the profile being at least partly formed in 
advance. Candidate data is selected from the data base under 
control of the context and the user is enabled to interact with 
the candidates. The profile is based on topical information 
supplied by the user in advance and a history of previous 
accesses from the user to the data base. 
CONTEXT-BASED AND USER-PROFILE DRIVEN INFORMATION RETRIEVAL

FIELD OF THE INVENTION

The invention relates to a method and system for enabling
retrieval of an information item from an information base in
an electronic network.

BACKGROUND ART

Rapidly expanding information archives provide access to
terabytes of electronic data, e.g., electronic museums,
electronic newspapers, musical archives, digital libraries,
software archives, mailing lists, up-to-date weather information
and geographic data. Consequently, current advances in
information technology are driven by the need to increase
the effectiveness of information access and retrieval.

Traditionally, information providers try to overcome the
inadequacies of information retrieval by providing fast and
powerful search engines, see, for example, U.S. Pat. No.
5,293,552 (PHN 13,666) herewith incorporated by reference.
Retrieval mechanisms based on keywords typically
return a large set of documents, but are not very precise in
their return. Examples of searching systems are commonly
available search engines, databases and library lookup
systems. The user interacts with the system by providing a
query with sufficient information and gets back a set of
documents that more or less match the query.

Traditional approaches have devised mechanisms to map
a user's query to a document based on overlapping terms or
concept words between the query and the document terms.
[illegible] 1996, Zurich, Switzerland. This approach uses a
quantitative measure of semantic similarity between index
terms for queries and documents.

Another recent method is described in "A Deductive Data
Model for Query Expansion", Kalervo Jarvelin, Jaana
Kristensen, Timo Niemi, Eero Sormunen and Heikki
Keskustalo, Proceedings of the 19th Annual International
ACM SIGIR Conference on Research and Development in
Information Retrieval, August 1996, Zurich, Switzerland.
This method introduces concept-based query expansion,
where each concept is expanded to a disjunctive set of
concepts on the basis of conceptual relationships pointed out
by the user.

Yet another known idea is proposed in "Incremental
Relevance Feedback for Information Filtering", James
Allan, Proceedings of the 19th Annual International ACM
SIGIR Conference on Research and Development in
Information Retrieval, August 1996, Zurich, Switzerland. This
idea relates to relevance feedback techniques that process
shifts in user interest patterns over a period of time. The user
feeds back notions of which query results he/she believes are
relevant to the current query.

OBJECT OF THE INVENTION

A key to effective information retrieval lies in mechanisms
that increase the precision values for documents
retrieved. One problem with existing search systems is that
if the query is not very precise, the user is left with the task
of scanning through a large amount of result data to
identify documents of interest, because a large percentage of
the information retrieved is not relevant to the user. Another
drawback is that the known retrieval methods supply a set of
results that is restricted to the literal search criteria entered
at that moment and not much else. That is, the electronic
information retrieval does not have the advantages of
real-life browsing at a bookstore where an interesting book cover
[illegible] his/her attention or awaken his/her interest.
Consequently, the information provider is unable to guide the
user to other, yet related, works that could be of interest to
this particular user.

It is therefore an object of the invention to provide a
method for retrieving information that improves the quality
of the result data.

SUMMARY OF THE INVENTION

To this end, the invention provides a method of enabling
a user to navigate through an electronic document base. The
invention provides a method of enabling a user to query an
electronic document base. The user supplies at least one
query object [illegible] geometrical shape or pattern, a
tune or rhythm representing one or more bars of a piece
[illegible]. The method comprises determining a topical
context for the query by means of extracting from an access
history, e.g., at least one preceding query, of the user to the
document base at least one concept object associated with
the current query. The concept object is used to create at
[illegible] profile. Then one or more documents are
identified in the document base under control of the user
[illegible].

The invention increases the effectiveness of browsing
[illegible] likely to be of interest to this specific user. The
[illegible] is used to update the user's
profile. The profile itself is used as a recommendation for
mapping relevant information from the information provider's
topic space, also referred to as document base, onto the
user's search space.

The profile gets updated dynamically in response to the
user's interactions with the document base. Accordingly, the
dynamic part reflects the path taken within the provider's
information space in the course of the user's search.
Preferably, the profile has also a static part that reflects the
user's long-term interests. The term "static" is used to
indicate a time scale substantially slower than that of the
dynamic part. The static part is determined by, for example,
letting the user provide topical information about his/her
fields of attention the first time that the user interacts with
the document base. Such entries can be changed manually in
due course. Alternatively or subsidiarily, statistical analysis
of a statistically relevant number of results over time enables
finding themes that stay substantially constant.

The preferred embodiment of the invention allows the
user to retain a constant theme in his/her profile (static part)
as well as to influence the profile by new issues (dynamic
part) generated while browsing the provider's information
space. This latter aspect of the invention gives a mechanism
to information providers to attract the user's interest while
the latter is browsing at their sites.

Preferably, the user is allowed to disable and enable the
static and/or dynamic part of his/her profile so as to be able
to choose whether or not to use the profiling in retrieving
information.
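
By way of illustration only, a profile having a static part and a dynamic part, each of which the user can enable or disable, might be sketched as follows. The class name UserProfile and all field and method names are hypothetical and are not taken from the specification; the sketch merely assumes that topics and concept objects can be represented as plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Hypothetical profile with a static and a dynamic part."""
    # Static part: long-term interests supplied by the user in advance.
    static_topics: set = field(default_factory=set)
    # Dynamic part: concept objects gathered while the user browses.
    dynamic_concepts: list = field(default_factory=list)
    # The user may switch either part on or off.
    use_static: bool = True
    use_dynamic: bool = True

    def record_concepts(self, concepts):
        """Add concept objects to the dynamic part, skipping those already present."""
        for concept in concepts:
            if concept not in self.dynamic_concepts:
                self.dynamic_concepts.append(concept)

    def active_context(self):
        """Return the topics and concepts that currently contribute to the context."""
        context = set()
        if self.use_static:
            context |= self.static_topics
        if self.use_dynamic:
            context |= set(self.dynamic_concepts)
        return context
```

With both parts disabled, active_context() returns an empty set, which corresponds to retrieval without profiling.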
Thus, the invention enables clustering and re-clustering of
the information space in a manner effective for highly
personalized browsing. The invention can be regarded as an
automatic version of the "refine" button as provided by
various search engines found on the Internet.

For the example with the music data base mentioned
above, see patent application Ser. No. 08/840^56, filed
Apr. 28, 1997 (PHA 23,241) herein incorporated by reference.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention is explained by way of example and with
reference to the accompanying drawings, wherein FIG. 1 is
a diagram illustrating the method of the invention.

PREFERRED EMBODIMENTS

FIG. 1 is a diagram of a system 100 illustrating the
method according to the invention by way of its main
functionalities. System 100 has an electronic document base 102
and a user terminal or client 104 through which the user
interacts with document base 102. For example, client 104
comprises an alphanumeric keyboard or a speech coder (not
shown) and a display device (not shown). The user enters, in
this example, query words into system 100 through the
keyboard or speech coder and gets visual feedback on
his/her entry and the query results as explained below.

System 100 comprises a static profile memory 106 that
stores indications of what represents this individual user's
long-term interests. For example, the user has provided
topics that represent her/his main fields of interest upon
being introduced to system 100 for the first time.
Alternatively, if, for example, the user's cultural and social
background and profession are known, the system may
assign by default this particular user to a particular category
typical of this type of user. Alternatively, or subsidiarily, the
user may specify that she/he is definitely NOT interested in
specific topics so as to be able to exclude certain categories
of documents right from the outset. All this information
contributes to creating a long-term profile of this user which
is stored in memory 106.

It is assumed that the user is interacting with system 100
for the first time and enters a query word through client 104.
System 100 now enables interpreting this query word within
a certain context that is determined by the static profile as
stored in memory 106. System 100 has a context generator
108 that generates one or more additional keywords
associated with the topic under consideration as given by the
user's entry. This is done, for example, via an algorithm that
is based on a topical partitioning of the information space
spanned by the documents in document base 102.
Alternatively, the keyword entered through client 104 is
mapped onto semantically similar terms in a dictionary. The
mapping is controlled by static profile 106 to eliminate
unrelated topics. For example, the entries "processor" and
"micro" can be mapped onto the topic "computers" via
"microprocessors", but also onto the topic "cooking" via
"food processor" and "microwave oven". If the user is a
rabid amateur cook with much too little time because she/he
is a very busy specialist in parallel data processing
architectures, both topics may be relevant and the context
should include both. If the static profile indicates that the
user is only interested in one of these categories, the context
should cause documents of the other category to be ignored.
If the static profile comprises neither indication, the context
should permit documents of both categories to be retrieved
if present in document base 102. In order to achieve this
selection, the keyword and one or more context keywords
are entered into the search engine of document base 102. If
the static profile has a category "NOT", i.e., of one or more
topics to be excluded in advance from the search, the search
engine is caused to execute the Boolean operation so as to
exclude documents that happen to comply literally
with the NOT conditions.

Assume that document base 102 identifies a large
number of documents that match the combination of the
words entered by the user within the context generated by
generator 108. The identifiers of these documents are
returned to the user, for example in the format used by the
PlanetSearch service of Philips Electronics at
http://www.planetsearch.com/, whose search engine is described
in U.S. Pat. No. 5,293,552. The results in this format are
represented as ranked according to relevance, and the
relative contribution of each keyword to each specific result is
indicated by a colored bar. The results of this query are also
sent to an analyzer 110. Analyzer 110 generates a set of
concept keywords based on these results. The generation
algorithm uses, for example, the topical partitioning of the
information space of base 102 and a weighted topical
dictionary. Such algorithms are known in the art. These
concept keywords are then stored in a memory 112 that
represents the user's dynamic profile. If the user starts a new
query by entering one or more new keywords, e.g., based on
the results returned in the previous query, a similar
procedure as outlined above is followed. The difference now is
that the content of memory 112 is being taken into account
in order to determine the context. The content of
memory 112 thus indicates the path taken by the user while
browsing the information space of document base 102.

The user may change his/her focus of interest during
his/her interaction with document base 102. If the user enters
the next time one or more query words that relate to a topic
that bears no relation to the context of the preceding query,
system 100 detects a context shift. Context shifts are being
monitored and are used to change the user's dynamic profile
112 in order to modify the context of the previous queries.
Upon a context shift, dynamic profile 112 does initially not
affect the query, as there are no concept words stored that
relate to the new topic.

The above is illustrated by the following examples.
Assume that the user has been interacting with system 100
using in succession the query words "dining", "recipes",
"curry". The context derived from these entries is "cooking"
or "food preparation". If the user now enters keywords
"processor" and "micro", the dynamic part of the profile lets
these terms be interpreted as "food processor" and
"microwave oven", respectively, and identifies documents relating
to the latter issues. Had the user been interacting with system
100 using, e.g., "parallel", "computing" and "algorithms",
the same terms "processor" and "micro" would have been
interpreted as "data processor" or "signal processor" and
"microprocessor" within the context established by the
dynamic profile as relating to "data processing" and
"computers".

As another example, assume that the user is initially
interested in ideas on how to invest money. A user looking
for a document on "investing" may have started off by
entering via client 104 the keywords "investments" and
"banking" into system 100. System 100 processes these
terms and retrieves documents that match this query. System
100 returns results to client 104 that represent the documents
retrieved. The queries are then enhanced by adding and
dropping a few keywords. The user browses through these
results and gets attracted to the idea of on-line banking. In
the next query the user adds a term "on-line" and queries
system 100 anew. This leads towards articles about a bill pay
system of a particular bank and the query is further enhanced
by adding the term "pay bill". After arriving at a desired
result, the user either quits the search, or shifts interest to
another topic altogether, say, "computer networking
architectures". This is referred to as a context shift, a shift in the
query that indicates a change of interest. By understanding
such context shifts it is possible to narrow the user's search
path. For example, a search for "ATM" could imply either
information regarding "asynchronous transmission mode
networking protocol" or "Automated Teller Machines".
Within the context of banking, the term "ATM" would have
led to documents on "Automated Teller Machines". Within
the context of "networking architectures", the term ATM
now leads to documents concerned with "asynchronous
transmission mode".
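
By way of illustration only, the interpretation of an ambiguous keyword within a context might be sketched as follows. The dictionary TOPICAL_DICTIONARY, the function interpret and the topic labels are hypothetical and merely stand in for the topical partitioning and dictionary mapping mentioned above, which the specification does not detail. The sketch assumes that each keyword maps onto one expanded term per topic and that topics marked as NOT of interest are removed before the context is applied.

```python
# Hypothetical topical dictionary: keyword -> {topic: expanded term}.
TOPICAL_DICTIONARY = {
    "processor": {"cooking": "food processor", "computers": "data processor"},
    "micro": {"cooking": "microwave oven", "computers": "microprocessor"},
    "ATM": {
        "banking": "Automated Teller Machines",
        "networking": "asynchronous transmission mode",
    },
}


def interpret(keywords, context_topics, excluded_topics=frozenset()):
    """Map each keyword onto expanded terms that fit the current context."""
    result = {}
    for keyword in keywords:
        senses = TOPICAL_DICTIONARY.get(keyword, {})
        # Drop senses under topics the user marked as NOT of interest.
        allowed = {t: s for t, s in senses.items() if t not in excluded_topics}
        # Prefer senses within the context; if none match, keep all allowed senses.
        matching = {t: s for t, s in allowed.items() if t in context_topics}
        chosen = matching or allowed
        result[keyword] = sorted(chosen.values())
    return result
```

For instance, interpret(["processor", "micro"], {"cooking"}) yields "food processor" and "microwave oven", whereas the same keywords within the context {"computers"} yield "data processor" and "microprocessor". With an empty context, both senses of each keyword are retained.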

A context shift is detected using the generation algorithm
mentioned above that uses a topical partitioning of the
information space of document base 102 and a weighted
topical dictionary. If, for example, the distance between a
newly entered keyword and the keywords representing the
current dynamic profile in memory 112 is too large, it is safe
to assume that there is a context shift. This distance is
obtained, for example, by computing a degree of overlap
between successive query terms. The query terms used to
compute the shift include the terms added on by analyzer
110. The larger the overlap, the higher the probability that
the query takes place within the same context. If there is no
overlap, it is safe to assume that there is a context shift. When
a context shift is detected, system 100 automatically maps
the user's queries to another part of the topical information
space. At the same time, system 100 continues to build up an
access history as the user now browses a different part of
document base 102.
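
By way of illustration only, the detection of a context shift from the degree of overlap between successive query terms might be sketched as follows. The function names overlap and is_context_shift, the use of the Jaccard measure and the threshold parameter are assumptions made for this sketch; the specification states only that a degree of overlap is computed and that absence of overlap indicates a context shift.

```python
def overlap(previous_terms, current_terms):
    """Degree of overlap between two sets of query terms (Jaccard measure, assumed)."""
    previous, current = set(previous_terms), set(current_terms)
    union = previous | current
    # Two empty sets share nothing, so their overlap is taken as zero.
    return len(previous & current) / len(union) if union else 0.0


def is_context_shift(previous_terms, current_terms, threshold=0.0):
    """Report a context shift when the overlap does not exceed the threshold.

    Both term sets are assumed to already include the concept keywords
    added on by the analyzer. The default threshold of 0.0 corresponds
    to the case where there is no overlap at all.
    """
    return overlap(previous_terms, current_terms) <= threshold
```

For instance, the term sets {"investments", "banking", "on-line", "finance"} and {"pay bill", "banking", "finance"} share two of five distinct terms, so no shift is reported, whereas a subsequent query on {"computer", "networking", "architectures"} shares none and is reported as a context shift.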

I claim: 

1. A method of enabling a user to query an electronic
document base system, the method comprising the steps of:
  forming a user profile of the user containing a dynamic list
  of concept objects extracted by the system from an
  access history of the user;
  entering at least one query object by the user into the
  system for a current query; and
  performing the following steps by the system:
    determining a topical context for the current query from
    the at least one query object and the user profile and
    generating at least one context object for the current
    query representing the determined topical context;
    identifying at least one document in the document base
    under control of the at least one query object and the
    at least one generated context object; and
    updating the access history of the user based on the
    current query.

2. The method of claim 1, wherein the access history is
formed by logging queries and other interactions of the user
and the system.

3. The method of claim 1, wherein the step of updating the
access history comprises the steps of:
  generating a further concept object based on the current
  query;
  verifying if the further concept object is absent from the
  user profile;
  storing the further concept object in the user profile if the
  further concept object was absent; and
  skipping the storing step if the further concept object was
  present in the user profile.

4. The method of claim 1, comprising enabling the user to 
specify a part of the user profile representing a profile of the 
user's interests. 

5. The method of claim 4, wherein the user is enabled to 
specify unwanted topics in the user profile and documents 
identified by the specified unwanted topics are excluded 
from being made available to the user. 

* * * * *



